1. Introduction and Background

A series of papers has recently appeared on the consistency of nonparametric regression function estimators and on rates of consistency (see Collomb, 1981, for a bibliographic review). In the present work we obtain pointwise rates of consistency by demonstrating a law of the iterated logarithm for a large class of regression function estimators. The estimators we shall look at are of the following type:


$$m_n(x) = n^{-1} \sum_{i=1}^{n} K_{r(n)}(x; X_i)\, Y_i \tag{1.1}$$

where $\{K_r : r \in I\}$ denotes a sequence of delta functions (or kernel sequence) and $\{(X_i, Y_i)\}$, $i = 1, 2, \ldots, n$, are independent observations from a distribution with unknown positive density $f(x,y)$.

Most nonparametric estimators of $m(x) = E(Y \mid X = x)$ are of this form, for instance the Nadaraya-Watson kernel estimator (more generally, delta function estimators) or orthogonal polynomial estimators.

A major result in the theory of consistency of kernel type estimators was obtained by Collomb, who gave necessary and sufficient conditions for consistency of the Nadaraya-Watson kernel estimate. For generalizations and related work see the bibliographic review of Collomb (1981), where parallel work on orthogonal polynomials is also presented. Stone (1977) considered the estimator defined in (1.1) and gave general conditions on the weights $K_r(x; X_i)$ for $m_n(x)$ to be consistent in $L^r$, i.e. for

$$E\,|m_n(x) - m(x)|^r \to 0$$

whenever $E|Y|^r < \infty$. Stone, however, points out that it is not clear from his results when an estimator of the Nadaraya-Watson type, to be discussed in section 4, is consistent in $L^r$. In the field of density estimation Wegman and Davies (1979), Hall (1981), and Csörgő and Hall (1982) have given laws of the iterated logarithm for different kinds of density estimators.


We begin by showing a law of the iterated logarithm for the shifted estimate

$$m_n(x) - E\,m_n(x). \tag{1.2}$$

That is, we center $m_n(x)$ around its expectation. We could also center it around $m(x)$, the regression curve, but since the bias is handled by purely analytic means, it suffices to look at (1.2). The treatment of these bias terms, using different smoothness assumptions on $m(\cdot)$ and $K_r(\cdot)$, is deferred to the sections where we apply the general result of section 2. In section 4 we show a law of the iterated logarithm for the Nadaraya-Watson kernel estimate with known and unknown marginal density $f_X(x)$ of $X$, and in section 5 we show a similar result for estimators based on orthogonal polynomials.

2.  A  law  of  the  iterated  logarithm  for  a  special  triangular  array. 

Let $\{(X_i, Y_i)\}$ be a sequence of independent and identically distributed random variables with pdf $f(x,y)$ and cdf $F(x,y)$, and with $EY^2 < \infty$. As in (1.1) let $\{K_r : r \in I\}$ be a sequence of real valued functions, each of bounded variation, and define

$$S_n(r) = \sum_{i=1}^{n} \{K_r(X_i)\, Y_i - E[K_r(X_i)\, Y_i]\},$$

which is a multiple of (1.2), where we have omitted the design point $x$ for convenience. Define also

$$\sigma(r,s) = \operatorname{cov}\{K_r(X)Y,\ K_s(X)Y\} \quad \text{and} \quad \sigma^2(r) = \sigma(r,r).$$

We will establish conditions similar to those of Hall (1981) and Csörgő and Hall (1982) under which $S_n(r)$, $r = r(n) \in I$, follows the law of the iterated logarithm. We demonstrate that

$$\limsup_{n \to \infty} \pm [\phi(n)]^{-1} S_n(r(n)) = 1 \quad \text{a.s.},$$

where $\phi(n) = (2 n \sigma^2(r) \log\log n)^{1/2}$. The set $\{S_n(r),\ n \ge 1\}$ is, in fact, a triangular sequence, and in this section it is shown that under certain assumptions it may be approximated by a Gaussian sequence with the same covariance structure. A law of the iterated logarithm can then easily be deduced using techniques similar to those of Hall (1981).

We shall also make use of the Rosenblatt transformation (Rosenblatt, 1952)

$$T(x,y) = (F_{Y|X}, F_X)(x,y),$$

which transforms the original data points $\{(X_i, Y_i)\}_{i=1}^{n}$ into a sequence of mutually independent random variables $\{(X_i', Y_i')\}_{i=1}^{n}$, uniformly distributed over $[0,1]^2$.
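As an illustration of the transformation, the following Python sketch applies it to a bivariate normal pair with standard margins and correlation `rho`, for which both $F_X$ and $F_{Y|X}$ are available in closed form. The distribution, the function names and the parameter `rho` are assumptions made for this example and do not come from the paper; the order of the two output components follows the display above.

```python
import math
import random


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def rosenblatt_transform(x, y, rho):
    """Map (x, y) to (F_{Y|X}(y | x), F_X(x)) for a bivariate normal pair.

    Assumes standard normal margins and correlation rho (illustrative choice).
    """
    f_x = normal_cdf(x)
    # Given X = x, Y is normal with mean rho * x and variance 1 - rho^2.
    f_y_given_x = normal_cdf((y - rho * x) / math.sqrt(1.0 - rho * rho))
    return f_y_given_x, f_x


def sample_pair(rng, rho):
    """Draw one (X, Y) from the assumed bivariate normal distribution."""
    x = rng.gauss(0.0, 1.0)
    y = rho * x + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    return x, y


# Transform a simulated sample; the two components should look like
# independent uniform variables on [0, 1].
rng = random.Random(0)
transformed = [rosenblatt_transform(*sample_pair(rng, 0.8), 0.8) for _ in range(4000)]
```

Although the simulated $X$ and $Y$ are strongly correlated, the transformed components should show sample means near 1/2 and a sample correlation near zero, in line with the independence and uniformity stated above.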

This transformation was also employed by Johnston (1982) and Mack and Silverman (1982) to obtain strong uniform consistency of Nadaraya-Watson kernel type regression function estimates. Define

Vn(un’  ■  ,  ,/  ldKr(n)(x)l  *  I W"?’ 1  •  "21 
I  l~Un 

where $\{u_n\}$ is a sequence of constants with $0 < u_n < \infty$.

Theorem 1. Suppose that the sequence of kernels and $\{u_n\}$ satisfy

$$a_n \cdot v_n(u_n) = o\big(n^{1/2} \sigma(r) (\log\log n)^{1/2} / (\log n)^2\big), \tag{2.1}$$

where $\{a_n\}$ is a sequence of positive constants tending to infinity, and

$$\sum_{n=3}^{\infty} \sigma(r)^{-2} (\log\log n)^{-1} \big[E\{K_r^2(X) \cdot I(|X| > u_n)\}\big] < \infty,$$

$$\sum_{n=3}^{\infty} \sigma(r)^{-2} (\log\log n)^{-1} \big[E\{K_r^2(X) \cdot I(|X| \le u_n) \cdot Y^2 \cdot I(|Y| > a_n)\}\big] < \infty. \tag{2.2}$$

Then on a rich enough probability space there exists a Gaussian sequence $\{T_n\}$ with zero means and the same covariance structure as $\{S_n(r)\}$, and such that

$$S_n(r) - T_n = o\big(n^{1/2} \sigma(r) (\log\log n)^{1/2}\big) \quad \text{a.s.}$$

The main idea of the proof is, as in Hall (1981) (for density estimators) and in Härdle (1983) (for regression estimators), the strong approximation of $F_n(z) - F(z)$ (density case) and of $F_n(x,y) - F(x,y)$ (regression case), respectively. For the case of density estimation Hall employs the results of Komlós, Major and Tusnády (1975). We will make use of a similar result (for the two dimensional case) by Tusnády (1977). The fundamental connection between the regression estimator $m_n(\cdot)$ and its strong approximation by a Gaussian process is established by the following lemma.

Lemma 1. On a rich enough probability space there is a version of a Brownian bridge $B(x',y')$, $(x',y') \in [0,1]^2$, such that

$$P\Big\{\sup_{x,y} |e_n(x,y)| > (C_1 \log n + u) \log n\Big\} < C_2\, e^{-C_3 u},$$

where $C_1, C_2, C_3$ are absolute constants and

$$e_n(x,y) = n\big[(F_n(x,y) - F(x,y)) - B(T(x,y))\big].$$

Proof. This is clear from Tusnády (1977) and the fact that $n^{1/2}[F_n(T^{-1}(x',y')) - F(T^{-1}(x',y'))]$, $(x',y') \in [0,1]^2$, is the empirical process of the transformed sample $\{(X_i', Y_i')\}_{i=1}^{n}$ (Rosenblatt, 1952).

The following theorem now establishes, under regularity conditions on the covariance $\sigma(r,s)$, that a law of the iterated logarithm (LIL) holds for $m_n(x)$, the regression function estimator defined in (1.1).

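Theorem 2. Suppose that (2.1) and (2.2) hold and that

$$\lim_{\varepsilon \to 0} \limsup_{n \to \infty} \sup_{m \in \Gamma_{n,\varepsilon}} \big|\sigma(r(m), r(n)) / \sigma^2(r(n)) - 1\big| = 0, \tag{2.3}$$

where $\Gamma_{n,\varepsilon} = \{m : |m - n| \le \varepsilon n\}$. Then

$$\limsup_{n \to \infty} \pm [\phi(n)]^{-1} S_n(r) = 1 \quad \text{a.s.}$$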

Condition (2.3) is the same as in Hall (1981) but with

$$\sigma(r_1, r_2) = \iint y^2 K_{r_1}(x) K_{r_2}(x) f(x,y)\,dx\,dy \quad \text{instead of his} \quad \sigma = \int K_{r_1}(x) K_{r_2}(x)\,dx$$

in the case of estimating the uniform density.

3.  Proofs. 

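To establish Theorem 1 we set

$$T_n = n \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} K_r(x)\, y \, dB(T(x,y)),$$

$B(x',y')$ being the Brownian bridge of Lemma 1, and show that the difference

$$R_n = n^{-1}(S_n(r) - T_n) = n^{-1} \iint K_r(x)\, y \, de_n(x,y)$$

satisfies

$$R_n = o\big(n^{-1/2} \sigma(r) (\log\log n)^{1/2}\big) \quad \text{a.s.} \tag{3.1}$$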

Note first that $T_n$ has the covariance structure ascribed to it in Theorem 1. This follows from the fact that the Jacobian $J(x,y)$ of $T(x,y)$ is $J(x,y) = f(x,y)$, the joint density of $(X,Y)$ (see Rosenblatt, 1952), and from the following lemma, stated without proof.

Lemma 2. Let $G_r(x,y) = K_r(x)\,y$. Then

$$(Z_1, Z_2) = \Big(\int_0^1\!\!\int_0^1 G_{r_1}(T^{-1}(x',y'))\,dB(x',y'),\ \int_0^1\!\!\int_0^1 G_{r_2}(T^{-1}(x',y'))\,dB(x',y')\Big)$$

has a bivariate normal distribution with zero means and covariances

$$\operatorname{cov}(Z_1, Z_2) = \iint K_{r_1}(x) K_{r_2}(x)\, y^2 f(x,y)\,dx\,dy - \Big[\iint K_{r_1}(x)\, y f(x,y)\,dx\,dy\Big]\Big[\iint K_{r_2}(x)\, y f(x,y)\,dx\,dy\Big] = \sigma(r_1, r_2).$$


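To demonstrate (3.1) we split up the integration regions and obtain the terms

$$R_{1,n} = \Big| n^{-1} \int_{|x| \le u_n}\!\int_{|y| \le a_n} K_r(x)\, y \, de_n(x,y) \Big| \le v_n(u_n) \cdot 2 \cdot a_n \cdot n^{-1} \cdot \sup_{x,y} |e_n(x,y)|,$$

$$R_{2,n} = \Big| n^{-1} \sum_{i=1}^{n} R_{i,n}^{(2)} \Big|, \qquad R_{i,n}^{(2)} = [K_r(X_i) \cdot I(|X_i| > u_n) \cdot Y_i \cdot I(|Y_i| \le a_n)] - E[K_r(X) \cdot I(|X| > u_n) \cdot Y \cdot I(|Y| \le a_n)],$$

$$R_{3,n} = \Big| n^{-1} \sum_{i=1}^{n} R_{i,n}^{(3)} \Big|, \qquad R_{i,n}^{(3)} = [K_r(X_i) \cdot I(|X_i| \le u_n) \cdot Y_i \cdot I(|Y_i| > a_n)] - E[K_r(X) \cdot I(|X| \le u_n) \cdot Y \cdot I(|Y| > a_n)],$$

$$R_{4,n} = \Big| n^{-1} \sum_{i=1}^{n} R_{i,n}^{(4)} \Big|, \qquad R_{i,n}^{(4)} = [K_r(X_i) \cdot I(|X_i| > u_n) \cdot Y_i \cdot I(|Y_i| > a_n)] - E[K_r(X) \cdot I(|X| > u_n) \cdot Y \cdot I(|Y| > a_n)],$$

$$R_{5,n} = n^{-1} \Big| \int_{|x| > u_n}\!\int_{|y| \le a_n} K_r(x)\, y \, dB(T(x,y)) \Big|,$$

$$R_{6,n} = n^{-1} \Big| \int_{|x| \le u_n}\!\int_{|y| > a_n} K_r(x)\, y \, dB(T(x,y)) \Big|,$$

$$R_{7,n} = n^{-1} \Big| \int_{|x| > u_n}\!\int_{|y| > a_n} K_r(x)\, y \, dB(T(x,y)) \Big|.$$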

From Lemma 1 we deduce that $n^{-1} \sup_{x,y} |e_n(x,y)| = O(n^{-1} (\log n)^2)$ a.s., and so by condition (2.1) we conclude that

$$R_{1,n} = o\big(n^{-1/2} \sigma(r) (\log\log n)^{1/2}\big) \quad \text{a.s.} \tag{3.2}$$

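Next observe that $\{R_{i,n}^{(2)}\}_{1 \le i \le n}$ are independent and identically distributed random variables. We then have by Markov's inequality that for any $\varepsilon > 0$

$$P\Big(n^{-1} \Big|\sum_{i=1}^{n} R_{i,n}^{(2)}\Big| > \varepsilon \cdot \sigma(r)\, n^{-1/2} \cdot (\log\log n)^{1/2}\Big) \le \varepsilon^{-2} \sigma(r)^{-2} (\log\log n)^{-1} \cdot E\big(R_{1,n}^{(2)}\big)^2.$$

So with the assumption $EY^2 < \infty$ and condition (2.2) it follows from the Borel-Cantelli lemma that

$$R_{2,n} = o\big(n^{-1/2} \sigma(r) (\log\log n)^{1/2}\big) \quad \text{a.s.} \tag{3.3}$$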

The terms $R_{3,n}$, $R_{4,n}$ may be estimated in the same way using Markov's inequality and condition (2.2), and we therefore have

$$R_{3,n} = o\big(n^{-1/2} \sigma(r) (\log\log n)^{1/2}\big) \quad \text{a.s.},$$

$$R_{4,n} = o\big(n^{-1/2} \sigma(r) (\log\log n)^{1/2}\big) \quad \text{a.s.} \tag{3.4}$$

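The remaining terms, $R_{5,n}$, $R_{6,n}$ and $R_{7,n}$, are all Gaussian with mean zero and standard deviations

$$\{E(R_{1,n}^{(2)})^2\}^{1/2}, \qquad \{E(R_{1,n}^{(3)})^2\}^{1/2}, \qquad \{E(R_{1,n}^{(4)})^2\}^{1/2},$$

respectively. Therefore, the tail probability of $R_{5,n}$, for instance, can be computed by

$$P\big(R_{5,n} > \varepsilon\, n^{-1/2} \sigma(r) (\log\log n)^{1/2}\big) = 2\Big[1 - \Phi\Big\{\varepsilon\, \sigma(r) \cdot (\log\log n)^{1/2} / \big[E(R_{1,n}^{(2)})^2\big]^{1/2}\Big\}\Big],$$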



where $\Phi$ denotes the cdf of the standard normal distribution. A similar equality holds for $R_{6,n}$ and $R_{7,n}$; therefore, we conclude in view of condition (2.2) and the usual approximations to the tails of the normal distribution that

$$R_{5,n} = o\big(n^{-1/2} \sigma(r) (\log\log n)^{1/2}\big) \quad \text{a.s.},$$

$$R_{6,n} = o\big(n^{-1/2} \sigma(r) (\log\log n)^{1/2}\big) \quad \text{a.s.}, \tag{3.5}$$

$$R_{7,n} = o\big(n^{-1/2} \sigma(r) (\log\log n)^{1/2}\big) \quad \text{a.s.}$$

Finally, the desired result of Theorem 1 follows by putting together statements (3.2)-(3.5).

The proof of Theorem 2 follows in much the same way as Theorem 1 in Hall (1981, p. 49). We only have to note that lemma 1 in Hall (1981, p. 49) has to be replaced by (2.3). Setting $Y \equiv 1$ in all our derivations shows that Hall's result follows from ours.

4. Kernel estimators.

Two types of kernel estimates of the regression function $m(x)$ will be considered here. The first is due to Nadaraya (1964) and Watson (1964) and is motivated by the formula

$$m(x) = \Big\{\int y f(x,y)\,dy\Big\} \Big/ f_X(x).$$

We define the Nadaraya-Watson estimate as follows:

$$m_n^*(x) = (nh)^{-1} \sum_{i=1}^{n} K((x - X_i)/h)\, Y_i \Big/ (nh)^{-1} \sum_{i=1}^{n} K((x - X_i)/h).$$

Consistency and asymptotic normality of $m_n^*(x)$ were considered by Schuster (1972), Johnston (1979), and Mack and Silverman (1982), among others. If the marginal density $f_X(x)$ is known, it is appropriate to replace the density estimate in the denominator of $m_n^*(x)$ by the true density $f_X(x)$. This leads to the following estimate:

$$\hat m_n(x) = (nh)^{-1} \sum_{i=1}^{n} K((x - X_i)/h)\, Y_i \Big/ f_X(x),$$

considered by Johnston (1979, 1982).
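The two estimates can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper: the Epanechnikov kernel is one assumed choice of a continuous kernel with compact support and vanishing first moment, and the function names, the simulated design (uniform $X$ on $[0,1]$, so that $f_X = 1$) and the regression curve $m(x) = x^2$ are hypothetical.

```python
import random


def epanechnikov(u):
    """Continuous kernel supported on [-1, 1] with zero first moment."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0


def nadaraya_watson(x, xs, ys, h, kernel=epanechnikov):
    """Ratio estimate: kernel-weighted mean of the responses near x."""
    weights = [kernel((x - xi) / h) for xi in xs]
    total = sum(weights)
    if total == 0.0:
        return float("nan")  # no observation falls within the window
    return sum(w * yi for w, yi in zip(weights, ys)) / total


def known_density_estimate(x, xs, ys, h, f_x, kernel=epanechnikov):
    """Same numerator, divided by the true marginal density value f_x at x."""
    n = len(xs)
    numerator = sum(kernel((x - xi) / h) * yi for xi, yi in zip(xs, ys)) / (n * h)
    return numerator / f_x


# Simulated example (assumed design): X uniform on [0, 1], m(x) = x^2.
rng = random.Random(0)
sample_x = [rng.uniform(0.0, 1.0) for _ in range(2000)]
sample_y = [x * x + rng.gauss(0.0, 0.1) for x in sample_x]
estimate_nw = nadaraya_watson(0.5, sample_x, sample_y, h=0.1)
estimate_known = known_density_estimate(0.5, sample_x, sample_y, h=0.1, f_x=1.0)
```

Both values should lie close to $m(0.5) = 0.25$; the ratio form normalizes by the observed local sample size, while the known-density form does not.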

Let us define $S^2(x) = E(Y^2 \mid X = x)$, $V^2(x) = S^2(x) - m^2(x)$, and assume that $f_X(x)$, $m(x)$ are twice differentiable and $S^2(x)$ is continuous. We assume further that the kernel $K(\cdot)$ is continuous, has compact support, $(-1,1)$ say, and that $\int_{-1}^{1} u K(u)\,du = 0$. This implies that $v_n(u_n)$ as used in (2.1) is constant for large enough $u_n$. We will make use of the following assumptions:

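$$n h^5 / \log\log n \to 0 \quad \text{as } n \to \infty, \tag{4.1}$$

$$\sum_{n=3}^{\infty} h (\log\log n)^{-1} \cdot E[Y^2 I(|Y| > a_n)] < \infty, \tag{4.2}$$

where $\{a_n\}$ is as in (2.1), (2.2) such that

$$a_n = o\big((n h^{-1} \log\log n)^{1/2} / (\log n)^2\big),$$

$$\lim_{\varepsilon \to 0} \limsup_{n \to \infty} \sup_{m \in \Gamma_{n,\varepsilon}} |h(m)/h(n) - 1| = 0. \tag{4.3}$$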

We then have the following theorem for $\hat m_n(x)$.

Theorem 3. Under the assumptions above

$$\limsup_{n \to \infty} \pm [\hat m_n(x) - m(x)]\,(nh / 2\log\log n)^{1/2} = \Big[S^2(x) \int K^2(u)\,du \Big/ f_X(x)\Big]^{1/2} \quad \text{a.s.}$$

The Nadaraya-Watson estimate also follows a LIL, as the following theorem shows.

Theorem 4. Under the assumptions above and $\sum_{n=1}^{\infty} n^{-2} h^{-1} < \infty$,

$$\limsup_{n \to \infty} \pm [m_n^*(x) - m(x)]\,(nh / 2\log\log n)^{1/2} = \Big[V^2(x) \int K^2(u)\,du \Big/ f_X(x)\Big]^{1/2} \quad \text{a.s.}$$


Note that the only difference between Theorem 3 and Theorem 4 is the different scaling factor. Since in general $S^2(x) > V^2(x)$, we may expect narrower asymptotic confidence bands for $m_n^*(x)$. This observation has already been made by Schuster (1972) and Johnston (1982). This paper, together with Härdle (1983), thus settles the question raised by Johnston (1982) of whether one is able to compute asymptotic confidence intervals for $m_n^*(x)$. Johnston derived (uniform) confidence intervals for $\hat m_n(x)$ only.
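To make the comparison of the two scaling factors concrete, the sketch below evaluates the almost sure asymptotic bound $[c(x) \int K^2(u)\,du / f_X(x)]^{1/2} (2 \log\log n / (nh))^{1/2}$ implied by Theorems 3 and 4, with $c = S^2$ for the known-density estimate and $c = V^2$ for the Nadaraya-Watson estimate. The function name, the default value of $\int K^2$ (which is 3/5 for the Epanechnikov kernel, an assumed choice) and the numerical inputs are illustrative only.

```python
import math

# Integral of K(u)^2 for the Epanechnikov kernel K(u) = 0.75 * (1 - u^2) on [-1, 1].
EPANECHNIKOV_L2 = 0.6


def lil_bound(n, h, cond_moment, f_x, kernel_l2=EPANECHNIKOV_L2):
    """Asymptotic a.s. bound for |estimate - m(x)| read off from the LIL.

    cond_moment is S^2(x) for the known-density estimate (Theorem 3)
    and V^2(x) for the Nadaraya-Watson estimate (Theorem 4).
    """
    if n < 3:
        raise ValueError("log log n is only positive for n >= 3")
    limit = math.sqrt(cond_moment * kernel_l2 / f_x)
    return limit * math.sqrt(2.0 * math.log(math.log(n)) / (n * h))


# Hypothetical point x with m(x) = 1 and V^2(x) = 1, hence S^2(x) = 2.
bound_known_density = lil_bound(1000, 0.1, cond_moment=2.0, f_x=1.0)
bound_nadaraya_watson = lil_bound(1000, 0.1, cond_moment=1.0, f_x=1.0)
```

The ratio of the two bounds is $[S^2(x)/V^2(x)]^{1/2}$, so the gain from using the Nadaraya-Watson form grows with $|m(x)|$ relative to the conditional standard deviation.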

Proof of Theorem 3. We first show that we may center $\hat m_n(x)$ around $E\hat m_n(x)$. This follows from

$$E\hat m_n(x) = f_X^{-1}(x)\, h^{-1} \int K((x-u)/h)\, m(u) f_X(u)\,du = m(x) + O(h^2),$$

using the smoothness of $m(\cdot)$ and $f_X(\cdot)$ and the assumptions on the kernel $K(\cdot)$ (Parzen, 1962; Rosenblatt, 1971).

From assumption (4.1) it thus follows that the bias term $(E\hat m_n(x) - m(x))$ vanishes of higher order. So it remains to show that

$$\limsup_{n \to \infty} \pm [\tilde m_n(x) - E\tilde m_n(x)] / (2 n h \log\log n)^{1/2} = \Big[S^2(x) \cdot f_X(x) \int K^2(u)\,du\Big]^{1/2} \quad \text{a.s.}, \tag{4.4}$$

where $\tilde m_n(x) = \sum_{i=1}^{n} K((x - X_i)/h)\, Y_i = \sum_{i=1}^{n} K_h(X_i)\, Y_i$.

From the assumptions on the kernel $K(\cdot)$ we conclude that $\delta_n(u) = h^{-1} K(u/h)$ is a delta function sequence (DFS) (Watson and Leadbetter, 1964). We now make use of this general approach in terms of DFS's and obtain the following:

$$h \cdot \sigma^2(h) = h \int \delta_n^2(x-u)\, S^2(u) f_X(u)\,du - h \Big[\int \delta_n(x-u)\, m(u) f_X(u)\,du\Big]^2 \to S^2(x) \cdot f_X(x) \int K^2(u)\,du \quad \text{as } n \to \infty.$$

This follows from Watson and Leadbetter (1964) by noting that $S^2(\cdot) f_X(\cdot)$ is continuous and that $\{h (\int K^2)^{-1} \cdot \delta_n^2(u)\}$ is itself a DFS. The use of this DFS technique would also considerably simplify Hall's proof (1981) for Rosenblatt-Parzen kernel density estimates.
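The DFS property used above can be checked numerically. The following sketch approximates $h \int \delta_n^2(x-u)\, g(u)\,du$ by a midpoint rule and compares it with $g(x) \int K^2(u)\,du$ for a small bandwidth. The Epanechnikov kernel (for which $\int K^2 = 3/5$), the test function $g(u) = 1 + u^2$, the grid size and all names are assumptions made for the illustration.

```python
def epanechnikov(u):
    """Continuous kernel supported on [-1, 1]; its square integrates to 3/5."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0


def delta(u, h):
    """Delta function sequence delta(u) = K(u / h) / h."""
    return epanechnikov(u / h) / h


def scaled_second_moment(g, x, h, grid=2000):
    """Midpoint-rule approximation of h * integral of delta(x - u)^2 * g(u) du.

    The integrand vanishes outside [x - h, x + h], so only that window is used.
    """
    lower = x - h
    width = 2.0 * h / grid
    total = 0.0
    for j in range(grid):
        u = lower + (j + 0.5) * width
        total += delta(x - u, h) ** 2 * g(u)
    return h * total * width


def g(u):
    """Smooth stand-in for the continuous function S^2(u) * f_X(u)."""
    return 1.0 + u * u


approx = scaled_second_moment(g, x=0.3, h=0.01)
limit = g(0.3) * 0.6  # g(x) times the integral of K^2
```

For $h = 0.01$ the two numbers agree to several decimal places, and the gap shrinks like $h^2$ for this choice of $g$.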

To establish (4.4) with the use of theorem 2 we have to show that (2.3) holds. We must thus demonstrate that if $h, k \to 0$ such that $h/k \to 1$ (in view of assumption (4.3)), then

$$h^{-1} \operatorname{cov}\{K((x-X)/h)\,Y,\ K((x-X)/k)\,Y\} \to 1. \tag{4.5}$$

But $E K((x-X)/h)\,Y = h \int \delta_n(x-u)\, m(u) f_X(u)\,du = o(h^{1/2})$, and so by the computations for $\sigma^2(h)$ above it remains to demonstrate that

$$h^{-1} \int [K((x-u)/h) - K((x-u)/k)]^2\, S^2(u) f_X(u)\,du \to 0.$$

From the boundedness of $S^2(\cdot)$ and $f_X(\cdot)$ it is clear that the integral above is dominated by

$$M \int [K(u) - K(uh/k)]^2\,du.$$

The kernel $K$ is continuous and so $K(uh/k) \to K(u)$ a.e., and it follows that (4.5) holds.

Assumption (2.1) follows from (4.2) since $K(\cdot)$ has compact support and thus $v_n(u_n) = \text{const.}$ for $n$ large enough. In view of the asymptotic formula for $\sigma^2(h)$ above we have by assumption (4.2)

$$a_n = o\big((n \sigma^2(h) \log\log n)^{1/2} / (\log n)^2\big),$$

which is assumption (2.1). Finally, assumption (2.2) follows immediately from (4.2) since $K$ has compact support and, as above, $\sigma^2(h) \sim h^{-1}$. Theorem 3 thus follows from theorem 2.

Proof of theorem 4. To prove theorem 4 we decompose

$$m_n^*(x) - m(x) = [(nh)^{-1} \tilde m_n(x) - m(x) f_n(x)] / f_X(x) + f_X^{-1}(x)\,[m_n^*(x) - m(x)] \cdot [f_X(x) - f_n(x)],$$

where $f_n(x) = (nh)^{-1} \sum_{i=1}^{n} K((x - X_i)/h)$ is a density estimate of $f_X(x)$ and $\tilde m_n(x) = \sum_{i=1}^{n} K((x - X_i)/h)\, Y_i$ as in (4.4). Now from Hall (1981), Theorem 2, it follows that

$$\limsup_{n \to \infty} \pm [f_n(x) - f_X(x)]\,(nh / 2\log\log n)^{1/2} = \Big[f_X(x) \int K^2(u)\,du\Big]^{1/2} \quad \text{a.s.} \tag{4.6}$$

if we use assumption (4.1), which ensures that the bias $(E f_n(x) - f_X(x)) = O(h^2)$ vanishes. Note that Hall's assumption (11) is not necessary here since we assume that $K(\cdot)$ has compact support. From Noda (1976) we conclude that $\sum n^{-2} h^{-1} < \infty$ makes $m_n^*(x) - m(x) = o(1)$ a.s. This and (4.6) thus yield that the second term on the RHS of the decomposition above is of order $o([nh / 2\log\log n]^{-1/2})$ a.s.

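The first summand of the decomposition above can be written as

$$(nh)^{-1}(\tilde m_n - E\tilde m_n)/f_X + ((nh)^{-1} E\tilde m_n - m f_X)/f_X - m (f_n - E f_n)/f_X + m (f_X - E f_n)/f_X.$$

As in the proof of theorem 3 it follows by assumption (4.1) that the bias terms $((nh)^{-1} E\tilde m_n - m f_X)$ and $(E f_n - f_X)$ vanish. It remains to show that

$$(nh)^{-1}(\tilde m_n - E\tilde m_n) - m (f_n - E f_n) \tag{4.7}$$

follows the LIL, i.e.

$$\limsup_{n \to \infty} \pm [(nh)^{-1}(\tilde m_n - E\tilde m_n) - m (f_n - E f_n)]\,(nh / 2\log\log n)^{1/2} = \Big[V^2(x) \cdot f_X(x) \cdot \int K^2(u)\,du\Big]^{1/2} \quad \text{a.s.}$$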

This can be deduced from theorem 2 if we rewrite (4.7) as

$$(nh)^{-1} \sum_{i=1}^{n} [K_h(X_i)\, Y_i - E K_h(X)\, Y] - m(x)\,(nh)^{-1} \sum_{i=1}^{n} [K_h(X_i) - E K_h(X)] = (nh)^{-1} \sum_{i=1}^{n} \{K_h(X_i)[Y_i - m(x)] - E K_h(X)[Y - m(x)]\}.$$

Next we show that (2.3) holds. The variance for the sequence above is now

$$h \cdot \sigma^2(h) = h \int \delta_n^2(x-u)\,[S^2(u) - m^2(x)]\, f_X(u)\,du - h \Big[\int \delta_n(x-u)\,[m(u) - m(x)]\, f_X(u)\,du\Big]^2 \to V^2(x) \cdot f_X(x) \int K^2(u)\,du \quad \text{as } n \to \infty.$$

As above in the proof of theorem 3 we conclude that (2.3) holds. Theorem 4 thus follows from theorem 2.


5.  Orthogonal  polynomial  estimators. 

Estimators of the regression function $m(x)$ based on orthogonal polynomials also fit into the general framework developed in the first section. We define the estimate based on a system of orthonormal polynomials on $[-1,1]$ as follows:

$$m_n(x) = n^{-1} \sum_{i=1}^{n} K_m(x; X_i)\, Y_i \Big/ n^{-1} \sum_{i=1}^{n} K_m(x; X_i),$$

where $m = m(n)$ tends with $n$ to infinity,

$$K_m(x; X_i) = \sum_{j=0}^{m} e_j(x)\, e_j(X_i),$$

and $\{e_j(\cdot)\}$ is the orthonormal system of polynomials.
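For the Legendre case the kernel $K_m$ can be evaluated with the standard three-term recurrence. The sketch below is illustrative only: the function names are hypothetical, the orthonormal system is taken to be $e_j = ((2j+1)/2)^{1/2} P_j$ with $P_j$ the classical Legendre polynomials on $[-1,1]$, and the last function implements the variant in which the sum is divided by a known marginal density value `f_x`.

```python
import math


def orthonormal_legendre(x, m):
    """Return [e_0(x), ..., e_m(x)] with e_j = sqrt((2j + 1) / 2) * P_j(x)."""
    p_prev, p_curr = 1.0, x
    values = [p_prev] if m == 0 else [p_prev, p_curr]
    for j in range(1, m):
        # Three-term recurrence: (j + 1) P_{j+1} = (2j + 1) x P_j - j P_{j-1}.
        p_next = ((2 * j + 1) * x * p_curr - j * p_prev) / (j + 1)
        values.append(p_next)
        p_prev, p_curr = p_curr, p_next
    return [math.sqrt((2 * j + 1) / 2.0) * p for j, p in enumerate(values)]


def legendre_kernel(x, u, m):
    """K_m(x; u) = sum over j = 0..m of e_j(x) * e_j(u)."""
    e_x = orthonormal_legendre(x, m)
    e_u = orthonormal_legendre(u, m)
    return sum(a * b for a, b in zip(e_x, e_u))


def series_estimate_known_density(x, xs, ys, m, f_x):
    """n^{-1} * sum of K_m(x; X_i) * Y_i, divided by the known density f_x at x."""
    n = len(xs)
    return sum(legendre_kernel(x, xi, m) * yi for xi, yi in zip(xs, ys)) / (n * f_x)
```

A useful sanity check is the reproducing property: integrating $K_m(x; u)\, p(u)$ over $[-1,1]$ returns $p(x)$ for every polynomial $p$ of degree at most $m$, which is what makes $K_m$ behave like a delta function sequence as $m$ grows.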


In the case of a known marginal density $f_X(x)$ we consider

$$m_n'(x) = n^{-1} \sum_{i=1}^{n} K_m(x; X_i)\, Y_i \Big/ f_X(x).$$

As in section 4 let $S^2(x)$ be the second conditional moment of $Y$ and $V^2(x)$ the conditional variance. We further assume that

$f_X(x)$ has compact support in $(-1,1)$,

$(1-x^2)^{-1/4} f_X(x)$ is integrable on $(-1,1)$.
We consider only the case of $e_j(\cdot) = p_j(\cdot)$, the orthonormal Legendre polynomials, here and assume that the following holds:

$$\lim_{\varepsilon \to 0} \limsup_{n \to \infty} \sup_{p \in \Gamma_{n,\varepsilon}} |m(p)/m(n) - 1| = 0, \tag{5.1}$$

$$\sum_{n=3}^{\infty} m^{-1} \cdot (\log\log n)^{-1} E(Y^2 \cdot I(|Y| > a_n)) < \infty, \tag{5.2}$$

where $\{a_n\}$ is, as in (2.2) and (4.2), a sequence of constants tending to infinity such that

$$a_n = o\big(n^{1/2}\, m\, (\log\log n)^{1/2} / (\log n)^2\big),$$

$$n / (m^5 \log\log n) \to 0 \quad \text{as } n \to \infty. \tag{5.3}$$

We then have the following theorem for $m_n'(x)$ and $m_n(x)$.

Theorem 5. Under the assumptions above

$$\limsup_{n \to \infty} \pm [m_n'(x) - m(x)]\,(n / 2 m \log\log n)^{1/2} = [S^2(x) / (f_X(x) \cdot \pi)]^{1/2}\, (1-x^2)^{-1/4} \quad \text{a.s.}$$

and

$$\limsup_{n \to \infty} \pm [m_n(x) - m(x)]\,(n / 2 m \log\log n)^{1/2} = [V^2(x) / (f_X(x) \cdot \pi)]^{1/2}\, (1-x^2)^{-1/4} \quad \text{a.s.}$$


Proof. We first show the LIL for $m_n'(x)$. The second assertion will then follow as theorem 4 follows from theorem 3. As in theorem 3 we show first that the bias $(E m_n'(x) - m(x))$ is negligible:

$$E m_n'(x) = [f_X(x)]^{-1} \cdot E K_m(x; X)\, Y = [f_X(x)]^{-1} \int K_m(x; u)\, m(u) f_X(u)\,du = m(x) + O(m^{-2})$$
by a slight modification of the argument proving theorem 1 in Walter and Blum (1979). By the same arguments as in Hall's (1981) proof of his theorem 3 (p. 60) we conclude that

$$\sigma_m^2 \sim E[K_m^2(x; X)\, Y^2] \sim m \cdot S^2(x) / \big([f_X(x)\,\pi]\,(1-x^2)^{1/2}\big).$$

Assumption (2.1) now follows from (5.2) and

$$\int |dK_m(x; u)| = O(m^2).$$

Assumption (2.2) also follows from (5.2), so we finally derive the desired result from theorem 2, since (2.3) may be proved as in theorem 3 using (5.1).